Meta and Stanford researchers introduce the Fast Byte Latent Transformer, which reduces inference memory bandwidth by over 50% and does not rely on subword tokenization.